MM-WEBSOM: A Variant of WEBSOM Based on Order Statistics

Authors

  • A. Georgakis
  • C. Kotropoulos
  • A. Xafopoulos
  • I. Pitas
Abstract

A variant of the WEBSOM architecture for information retrieval is proposed in this paper. WEBSOM is based on the self-organizing map, which employs a linear LMS adaptation rule for updating the weight vector of each neuron. Accordingly, the weight vector converges asymptotically to the conditional cluster mean of the feature vectors assigned to the class represented by the weight vector of the neuron. We propose to replace this updating rule with one that employs the marginal median. The objective is to overcome the drawbacks of the standard technique in the presence of outliers in the training set and to use robust estimators of the reference vectors for each class. Experimental results demonstrate superior performance of the proposed variant over the standard algorithm in terms of the number of training iterations needed for the mean square error (i.e., the average distortion) to drop to 1/e of its initial value. We also provide precision-recall curves as a measure of the quality of the clustering procedure. Both techniques are tested on a corpus of web pages selected from the Internet.
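The contrast the abstract draws between the two estimators can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact update rule, so the sketch assumes a neuron's reference vector is taken as the component-wise (marginal) median of the feature vectors currently assigned to it. The function names and the toy data are hypothetical.

```python
from statistics import median


def lms_update(weight, x, learning_rate):
    """One linear LMS step: move the weight vector toward sample x."""
    return [w + learning_rate * (xi - w) for w, xi in zip(weight, x)]


def marginal_median(samples):
    """Component-wise (marginal) median of equal-length vectors."""
    return [median(component) for component in zip(*samples)]


# Hypothetical toy data: four feature vectors near (1, 1) plus one outlier.
assigned = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [50.0, -40.0]]

# The cluster mean (the asymptotic target of the LMS rule) is pulled
# far away from the inliers by the single outlier.
mean_estimate = [sum(c) / len(assigned) for c in zip(*assigned)]

# The marginal median stays with the bulk of the data.
median_estimate = marginal_median(assigned)

print("mean:  ", mean_estimate)
print("median:", median_estimate)
```

On this toy set the mean lands far from the four inliers, while the marginal median stays at (1, 1), which is the kind of robustness to outliers the abstract refers to.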

Similar resources

Generalizability of the WEBSOM Method to Document Collections of Various Types

WEBSOM is a method in which the self-organizing map algorithm is used to automatically organize collections of documents on a map to enable easy exploration of the collection. This article illustrates with case studies how collections of various types of text can be successfully organized using the WEBSOM. The emphasis is on describing the particular challenges that each type of material poses,...

Statistical Aspects of the WEBSOM System in Organizing Document Collections

WEBSOM is a novel method for organizing document collections onto map displays to enhance the interactive browsing and retrieval of the documents. The map is organized automatically according to the contents of the full-text documents by the Self-Organizing Map algorithm. The map display provides a visual overview of the whole document collection. The overview, the map display, aids in the exp...

Modelling Climate Change Effects on Wine Quality Based on Expert Opinions Expressed in Free-Text Format: The WEBSOM Approach

The motivation for modelling the effects of climate change on viticulture and wine quality using both qualitative and quantitative data within an integrated analytical framework is described. The major constraints and solutions evident when taking such an approach are outlined. WEBSOM is a novel self-organising map (SOM) method for extracting relevant domain-dependent characteristics from web b...

Mining massive document collections by the WEBSOM method

A viable alternative to the traditional text-mining methods is the WEBSOM, a software system based on the Self-Organizing Map (SOM) principle. Prior to the searching or browsing operations, this method orders a collection of textual items, say, documents, according to their contents, and maps them onto a regular two-dimensional array of map units. Documents that are similar on the basis of their ...

Very Large Two-Level SOM for the Browsing of Newsgroups

On January 19, 1996 we published on the Internet a demo of how to use Self-Organizing Maps (SOMs) for the organization of large collections of full-text files. Later we added other newsgroups to the demo. It can be found at the address http://websom.hut../websom/. In the present paper we describe the main features of this system, called the WEBSOM, as well as some newer developments of it.

Publication date: 2001